5 - Machine Learning for Physicists [ID:11487]

...vectors with respect to which you want to express the input.

So if you consider the input, the string of neuron values, as a vector,

then being given many different inputs means being given many different vectors.

And the question is: what is a good choice for a small number of vectors to represent these inputs with?

Okay, so when we went through the mathematics of this we arrived at an object that is really important in this game.

And that is, you have to somehow be able to talk about the statistics of the input vectors.

And taking a page out of the quantum mechanics textbooks, let's call these vectors ψ.

So these are the input vectors. Each new ψ will be a different input vector.

And ψ_m will be one of the components of this vector.

So m would range over the number of input neurons that you have.

And what we figured out is that, in order to find the best choice for the weights in this neural network, in this linear autoencoder,

what matters is not the average of the vector.

That could be trivial; for example, it could be zero, and I will assume it to be zero.

But what matters is something like the variance, or more particularly the cross-correlation matrix.

So here I take some component ψ_m and then maybe some other component ψ_n.

And in order to make it similar to quantum mechanics, I will even allow for complex-valued vectors.

That is why one of the two factors carries a complex conjugate. But this is not so important in the present context.

Anyway, we multiply two different components of this vector.

And then we average over all possible input examples.

I mean, this particular ψ_m is a component of one particular input example.

But now I will average this quadratic form over all kinds of input vectors.

And so this is obviously some statistical measure that describes your distribution of input vectors.
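Written out as a formula, this is the object in question (my reconstruction of the blackboard expression; the symbol ρ and the placement of the complex conjugate on the second factor follow the usual quantum-mechanics convention and are not fixed by the transcript):

    \rho_{mn} = \langle \psi_m \, \psi_n^* \rangle = \frac{1}{N} \sum_{j=1}^{N} \psi_m^{(j)} \left( \psi_n^{(j)} \right)^*

where j = 1, ..., N labels the input examples, and where the averages of the individual components, ⟨ψ_m⟩, are assumed to be zero as stated above.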

So, for example, it could be that the different components of the input vector are, more or less, completely independent random variables.

In that case this would be a diagonal matrix: only for m equal to n would you get a non-zero contribution.

Because if m equals n, what sits inside the brackets is obviously |ψ_m|²,

so this is positive, and it gives you something non-zero even after averaging.

But in all the other cases, with m not equal to n, this would just average away to zero,

because for m not equal to n the components are independent by assumption, and their averages are zero.

So that would be a simple example. But in general the components will not be independent,

and then even for m not equal to n this gives something non-zero.
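In formulas, the independent zero-mean case described above reads (in the ρ notation introduced earlier):

    \rho_{mn} = \delta_{mn} \, \langle |\psi_m|^2 \rangle

i.e. a diagonal matrix whose entries are the variances of the individual components.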

And then this is a very interesting and useful matrix to look at.

And in quantum mechanics this matrix is known as the density matrix.

And this is used to describe exactly situations where your wave function is not quite certain.

So you are talking about a statistical ensemble of wave functions, for example in thermal systems.

So the wave function could be this or it could be that and so on with different probabilities.

And then you want to describe this situation of uncertainty by using a density matrix.

Okay, and the claim is now that knowing this Hermitian matrix is already good enough to understand

what will be the optimal linear autoencoder for this case.

And the claim, which would take only a few lines to show, is that you just need to take this matrix,

which, by the way, is Hermitian, and you need to diagonalize it.

And you look at the eigenvalues and the eigenvectors.

And now you just keep the eigenvectors with the largest eigenvalues.

They have a meaning: they are, so to speak, the most important eigenvectors in this game.
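As a concrete illustration of this recipe, here is a minimal NumPy sketch: estimate the correlation matrix from a batch of inputs, diagonalize it, and keep the k eigenvectors with the largest eigenvalues as the weights of the linear autoencoder. The names (psi, rho, V, k) and the random stand-in data are my own choices, not from the lecture.

    import numpy as np

    rng = np.random.default_rng(0)
    N, M, k = 1000, 10, 2          # examples, input neurons, kept eigenvectors

    # stand-in data: N input vectors psi of length M (real-valued for simplicity)
    psi = rng.normal(size=(N, M))
    psi -= psi.mean(axis=0)        # enforce the zero-mean assumption from above

    # cross-correlation ("density") matrix rho_{mn} = <psi_m psi_n*>
    rho = psi.T @ psi.conj() / N   # shape (M, M), Hermitian by construction

    # diagonalize; eigh returns the eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(rho)
    V = eigvecs[:, -k:]            # the k eigenvectors with the largest eigenvalues

    # linear autoencoder: encode by projecting onto V, decode by expanding again
    code = psi @ V.conj()          # hidden-layer activations, shape (N, k)
    recon = code @ V.T             # reconstructions, shape (N, M)

    print("kept eigenvalues:", eigvals[-k:])
    print("mean squared error per vector:",
          np.mean(np.sum(np.abs(psi - recon) ** 2, axis=1)))

For this choice the mean squared reconstruction error per vector comes out as the sum of the discarded eigenvalues, which is one way of seeing why keeping the largest ones is the optimal choice.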

For example, say your Hilbert space is 10-dimensional, so the vectors are of length 10.

But suppose that in the statistical averaging you only ever encounter two different vectors,

because 50% of the time you get this vector and 50% of the time that vector, and never any other vector.

Then it would turn out that this Hermitian matrix has at most two non-zero eigenvalues,

and the rest would be zero.

And then keeping those two eigenvectors would be absolutely enough to describe everything.
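A quick numerical check of this claim, with two hypothetical random vectors a and b standing in for "this vector" and "that vector" (real-valued for simplicity):

    import numpy as np

    rng = np.random.default_rng(1)
    a = rng.normal(size=10)   # "this vector", drawn with probability 50%
    b = rng.normal(size=10)   # "that vector", drawn with probability 50%

    # ensemble average of the outer product: rho = 0.5 a a^T + 0.5 b b^T
    rho = 0.5 * np.outer(a, a) + 0.5 * np.outer(b, b)

    eigvals = np.linalg.eigvalsh(rho)
    print(np.round(eigvals, 8))   # eight of the ten eigenvalues vanish

The matrix is a sum of two rank-1 projectors, so its rank is at most two.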

And so, just to give a graphical representation of this: if I have a 2D Hilbert space...

Part of a video series

Access: Open Access

Duration: 01:23:49 min

Recorded: 2019-05-27

Uploaded: 2019-05-28 22:11:24

Language: en-US

This is a course introducing modern techniques of machine learning, especially deep neural networks, to an audience of physicists. Neural networks can be trained to perform diverse challenging tasks, including image recognition and natural language processing, just by training them on many examples. Neural networks have recently achieved spectacular successes, with their performance often surpassing humans. They are now also being considered more and more for applications in physics, ranging from predictions of material properties to analyzing phase transitions. We will cover the basics of neural networks, convolutional networks, autoencoders, restricted Boltzmann machines, and recurrent neural networks, as well as the recently emerging applications in physics.

Tags

points, vectors, space, matrix, memory, high, layer, probability, gradients, network, dimensional, distances, images, neurons, components